METHOD OF IMPLEMENTING LOGIC FUNCTIONS
USING A LOOK-UP-TABLE

Field of the Invention 

The present invention relates to a method and/or
architecture for implementing functions generally and, more
particularly, to a method and/or architecture for implementing
logic functions in a look-up-table (LUT).

Background of the Invention 

Multipliers can be implemented in programmable logic 
devices implementing logic, memory or a combination thereof. 
Multipliers can be implemented in a memory using a look-up-table
(LUT). The contents of the LUT, when implemented for a multiplier,
are typically written when the device is programmed and are not 
changed. Recently, programmable logic devices with multiple port 
memories have become available. The multiple port memories allow 
a user to perform multiple reads from the same memory in parallel. 

Conventional approaches for providing programmable logic
multipliers in a memory implement a single port memory containing
a look-up-table (LUT) of results. The results are partial products
of the operands presented as addresses to the memory. For each LUT,
one partial product can be generated per clock cycle. If a user
desires to generate multiple partial products per clock cycle (as is
typical of fast multipliers), the user must implement multiple LUTs
(and therefore multiple programmable devices). The multiple LUTs
require additional memory resources, including area.

Conventional approaches can only read one partial product 
per LUT, per clock cycle. To provide fast multipliers in 
conventional approaches, several LUTs must be implemented in 
parallel. Furthermore, conventional approaches require additional 
device resources. 

Summary of the Invention 

The present invention concerns an apparatus comprising
one or more look-up-tables (LUTs). The LUTs may be configured to
provide logical functions. The one or more LUTs are generally
implemented within a multiport memory.

The objects, features and advantages of the present
invention include providing a method and/or architecture for
implementing arithmetic and other logic functions that may (i)
allow a single multi-port memory to be implemented to generate
several partial products per clock cycle, (ii) allow fewer
look-up-tables (LUTs) to be implemented in order to generate a
given number of partial products, (iii) provide efficient
utilization of resources of a programmable device and/or (iv) allow
a designer to implement a design in a smaller and cheaper device.

Brief Description of the Drawings 

These and other objects, features and advantages of the 
present invention will be apparent from the following detailed 
description and the appended claims and drawings in which: 

FIG. 1 is a block diagram of a preferred embodiment of 
the present invention; 

FIG. 2 is a block diagram of another preferred embodiment 
of the present invention; and 

FIG. 3 is a timing diagram illustrating an operation of 
the present invention. 

Detailed Description of the Preferred Embodiments 

Referring to FIG. 1, a block diagram of a circuit 100 is
shown in accordance with a preferred embodiment of the present
invention. The circuit 100 may provide an efficient implementation
of multipliers in a multi-port memory. The circuit 100 may provide
implementation of logical (e.g., arithmetic or other logic)
functions using multi-port memories. The circuit 100 may provide
an implementation of logic functions using a multi-port memory that
may be area efficient. Additionally, the circuit 100 may allow
increased (e.g., faster) performance for look-up-table (LUT) based
multipliers. For example, the circuit 100 may implement a pipeline
configuration (to be discussed in connection with FIGS. 2 and 3).
In one example, the circuit 100 may be implemented as a multi-port
multiplier. In another example, the circuit 100 may be implemented
as a multi-port LUT based multiplier.

The structure of the circuit 100 generally comprises a
number of look-up-tables (LUTs) 102a-102n, an adder block (or
circuit) 104 and a result block (or circuit) 106. The LUTs
102a-102n may be implemented as serial and/or parallel devices.
Additionally, a particular number of LUTs 102a-102n may be
dependent on a structure of the circuit 100. However, the circuit
100 may implement, preferably, a single LUT. In one example, each
of the LUTs 102a-102n may be implemented as a port looking into
multiple LUTs. In another example, the LUTs 102a-102n may be
implemented as a port of a single LUT. The number of LUTs 102a-102n
may be related to a number of ports of a multi-port memory.
However, each of the LUTs 102a-102n may be implemented as another
appropriate device (e.g., single port memory) and/or configuration
in order to meet the criteria of a particular implementation.
Additionally, a particular number of LUTs 102a-102n may be varied
in order to meet the criteria of a particular implementation.

The LUTs 102a-102n may be implemented in order to 
generate partial products in a multi-port memory. In one example, 
the circuit 100 may be implemented as a multi-port memory. In 
another example, the circuit 100 may be implemented as a ROM, RAM, 
PROM, EPROM, EEPROM, flash memory or other appropriate memory 
device to meet the design criteria of a particular implementation. 

Each of the LUTs 102a-102n may have an input 110 and an
input 112. The LUTs 102a-102n may receive a number of signals
(e.g., INa-INn) at the inputs 110a-110n and the inputs 112a-112n.
For example, the LUT 102a may have the input 110a that may receive
the signal INb and the input 112a that may receive the signal INd.
The LUT 102a may perform arithmetic or other logic functions with
the signal INb and the signal INd. In one example, the signals
INa-INn may be implemented as address signals. In another example,
the signals INa-INn may be implemented as 8-bit or another
appropriate bit size signals. In another example, the signals
INa-INn may be implemented as multi-bit and/or single-bit signals
in a serial or parallel configuration. Additionally, the particular
number of signals INa-INn may be implemented dependent upon a
particular configuration of the circuit 100. For example, the
input signals INa and INb may be concatenated on a single input
line. However, the input signals INa-INn may be presented to the
circuit 100 as another appropriate multi-bit and/or single-bit
serial or parallel combination in order to meet the criteria of a
particular implementation.

Each of the LUTs 102a-102n may also have an output
114a-114n that may present a signal (e.g., RESULTa-RESULTn),
respectively. Each of the signals RESULTa-RESULTn may be presented
to an input 116a-116n of the adder block 104. In one example, the
signals RESULTa-RESULTn may each be implemented as a partial
product. In another example, the signals RESULTa-RESULTn may be
implemented as 16-bit signals. In another example, the signals
RESULTa-RESULTn may each be implemented as a multi-bit and/or
single-bit signal in a parallel or serial configuration. The LUTs
102a-102n may be implemented to perform arithmetic or other logic
functions on the partial products RESULTa-RESULTn.

The signals RESULTa-RESULTn may be presented to a number
of inputs 116a-116n of the adder block 104, respectively.
Additionally, the adder 104 may have a number of inputs 118a-118n
that may receive a number of signals (e.g., SHIFTa-SHIFTn). In one
example, the signals SHIFTa-SHIFTn may be implemented as inputs to
carry chains. In another example, the signals SHIFTa-SHIFTn may be
implemented as bit shift signals. In another example, the signals
SHIFTa-SHIFTn may be implemented as multi-bit and/or single-bit
signals in a serial or parallel configuration.

The shift signals SHIFTa-SHIFTn may be used when adding
the partial products (MULTIa-MULTIn) to form the result of the
multiply operation. For example,

        A B
      * C D
        D.B
      A.D<-
      C.B<-
  + C.A<-<-
     RESULT

Where the arrows may represent logical shifts and the
'.' may indicate multiplication. The logical shifts SHIFTa-SHIFTn
may be implemented as input signals to the carry chain adder, or to
another appropriate type adder. The logical shifts SHIFTa-SHIFTn
may indicate a power of a base (e.g., usually base 10) to shift the
partial products. Effectively, the shifts are generally replaced
with 0 when the addition is done, so the result is:

     RESULT = D.B + A.D0 + C.B0 + C.A00
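
The shift-and-add arithmetic above may be illustrated in software. The following Python fragment is a minimal sketch only and is not part of the disclosed circuit; the function name multiply_two_digit, its parameters and the example operands are hypothetical and chosen purely for illustration. Each arrow in the layout is modeled as a shift by one digit position, i.e., a multiplication by the base.

```python
def multiply_two_digit(a, b, c, d, base=10):
    """Multiply the two-digit numbers AB and CD using four partial products.

    a and c are the high digits, b and d the low digits (hypothetical
    names mirroring the operands A, B, C and D in the text).
    """
    # Each partial product is one digit-by-digit multiplication.
    partial_products = [d * b, a * d, c * b, c * a]
    # Shift amounts in digit positions: none, one, one and two arrows.
    shifts = [0, 1, 1, 2]
    # Shifting by k positions multiplies by base**k (appends k zeros).
    return sum(p * base**k for p, k in zip(partial_products, shifts))


# 47 * 36, i.e., A=4, B=7, C=3, D=6 in base 10.
result = multiply_two_digit(4, 7, 3, 6)
print(result)  # 1692
```

The same function may be applied to 4-bit digits by passing base=16, in which case each shift corresponds to four bit positions.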

The adder block 104 may also have an output 120 that may
present a signal (e.g., OUT) to an input 122 of the result block
106. The signal OUT may be implemented as an addition result
signal. In one example, the signal OUT may be implemented as a
32-bit result signal. In another example, the signal OUT may be
implemented as a multi-bit signal. However, the signal OUT may be
implemented as a multi-bit and/or single-bit signal in a parallel
or serial configuration. The signal OUT may be generated in
response to the signals RESULTa-RESULTn. Additionally, the signal
OUT may be generated in response to the signals SHIFTa-SHIFTn.

The adder 104 may be implemented to shift (e.g., the
signals SHIFTa-SHIFTn) and sum the partial products
RESULTa-RESULTn. In one example, the adder 104 may comprise a
sequence of 8-bit carry chains. In another example, the adder 104
may comprise a sequence of varying bit width carry chains. In
another example, the adder 104 may be implemented as a 32-bit
adder. However, the adder 104 may implement another appropriate bit
size and/or number of adders or carry chains in order to meet the
criteria of a particular implementation.

The LUTs 102a-102n may be connected by a routable
interconnect (not shown). The routable interconnect may link
function blocks (e.g., the LUTs 102a-102n and the adder 104).
Additionally, the routable interconnect may allow configuration of
the LUTs 102a-102n. The routable interconnect may allow
configuration of a bit width and/or depth of the LUTs 102a-102n.
The signals SHIFTa-SHIFTn may allow addition of the partial
products generated by the LUTs 102a-102n.

The circuit 100 may implement the LUTs 102a-102n in a
multi-port memory. In one example, the LUTs 102a-102n may be
implemented in a quad port memory. In another example, the LUTs
102a-102n may be implemented in a dual port memory. The multi-port
memory 100 may allow a user to provide an address (e.g., the
signals INa-INn) to each port (e.g., the LUTs 102a-102n) and read
out a partial product (e.g., the signals RESULTa-RESULTn) from each
port on each clock cycle. The circuit 100 may allow the user to
implement a single multi-port memory to provide functionality that
would previously have required several single port memories, in a
single clock cycle. The circuit 100 may provide considerable
saving of memory resources. The circuit 100 may allow the user to
program more logic into a given device. Additionally, the circuit
100 may allow the user to implement a design in a smaller and
cheaper device.
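
The per-port read behavior described above may be modeled in software. The following Python fragment is a hedged sketch under stated assumptions and not a description of any particular device: the class MultiPortLUT, the helper pack, the 4-bit operand width and the quad port configuration are all hypothetical choices made for illustration. One table of 4-bit by 4-bit products is shared by all ports, and a single call to read_cycle stands in for one clock cycle.

```python
class MultiPortLUT:
    """Toy model of a look-up-table stored in a multi-port memory."""

    def __init__(self, num_ports, operand_bits=4):
        self.num_ports = num_ports
        self.operand_bits = operand_bits
        mask = (1 << operand_bits) - 1
        # Contents are written once, as when the device is programmed:
        # the word at address {hi, lo} holds the product hi * lo.
        self.table = [
            (addr >> operand_bits) * (addr & mask)
            for addr in range(1 << (2 * operand_bits))
        ]

    def read_cycle(self, addresses):
        """Return one partial product per port for a single clock cycle."""
        if len(addresses) > self.num_ports:
            raise ValueError("more addresses than ports in one cycle")
        return [self.table[addr] for addr in addresses]


def pack(hi, lo, operand_bits=4):
    """Concatenate two operands into one address (hi in the upper bits)."""
    return (hi << operand_bits) | lo


# 8-bit operands AB and CD, split into 4-bit halves.
x, y = 0xAB, 0xCD
a, b = x >> 4, x & 0xF
c, d = y >> 4, y & 0xF

lut = MultiPortLUT(num_ports=4)
# Four addresses, one per port, all served by the same memory in one cycle.
partials = lut.read_cycle([pack(b, d), pack(a, d), pack(c, b), pack(c, a)])
shifts = [0, 4, 4, 8]  # bit shifts for 4-bit digits
product = sum(p << s for p, s in zip(partials, shifts))
```

In this model, a single port memory would correspond to num_ports=1, so the same four partial products would require either four read cycles or four separate copies of the table.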

Referring to FIG. 2, a block diagram of another preferred
embodiment of the present invention is shown, marked with primed
notation. The circuit 100' may be similar to the circuit 100. The
circuit 100' may illustrate a pipeline configuration of the circuit
100. The circuit 100' may additionally comprise a number of
registers 150a-150n (where n is an integer), a number of registers
152a-152n (where n is an integer) and a register 154. A particular
number of registers 150a-150n, 152a-152n and 154 may be varied in
order to meet the criteria of a particular implementation. In one
example, the registers 150a-150n, 152a-152n and 154 may be
implemented as pipeline register stages. However, other
appropriate type registers and/or stages may be implemented in
order to meet the criteria of a particular implementation. The
registers 150a-150n, 152a-152n and 154 may be implemented to
increase a throughput of the circuit 100'. However, the circuit
100' may require extra cycles of latency.
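
The trade-off between throughput and latency in a pipeline configuration may be illustrated with a cycle-by-cycle simulation. The following Python fragment is a sketch under assumptions and is not taken from FIG. 2: it assumes three register stages (address registers, data registers between the LUTs and the adder, and a result register), and the names run_pipeline and lookup are hypothetical. The look-up itself is modeled by direct multiplication of 4-bit halves as a stand-in for the memory contents.

```python
def lookup(pair):
    """Stand-in for the LUT read: four (partial product, bit shift) pairs."""
    x, y = pair
    a, b = x >> 4, x & 0xF
    c, d = y >> 4, y & 0xF
    return [(b * d, 0), (a * d, 4), (c * b, 4), (c * a, 8)]


def run_pipeline(operand_pairs):
    """Simulate a three-stage pipelined multiplier, one clock edge per loop.

    Returns a list of (cycle, product) tuples, where cycle is the clock
    edge at which the product appears in the result register.
    """
    addr_reg = None    # assumed role of the registers 150a-150n
    data_reg = None    # assumed role of the registers 152a-152n
    outputs = []
    # Two idle cycles (None) after the inputs drain the pipeline.
    stream = list(operand_pairs) + [None, None]
    for cycle, item in enumerate(stream, start=1):
        # Update back to front so each stage sees last cycle's value.
        result_reg = None  # assumed role of the register 154
        if data_reg is not None:
            result_reg = sum(p << s for p, s in data_reg)
        data_reg = None if addr_reg is None else lookup(addr_reg)
        addr_reg = item
        if result_reg is not None:
            outputs.append((cycle, result_reg))
    return outputs


outputs = run_pipeline([(3, 5), (0xAB, 0xCD), (255, 255)])
print(outputs)  # [(3, 15), (4, 35055), (5, 65025)]
```

In this model the first product appears at the third clock edge (the added latency), after which one product appears per clock edge (the increased throughput).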

Referring to FIG. 3, a timing diagram 300 illustrating an
operation of the present invention is shown. The timing diagram
300 may illustrate a multiplication operation of the present
invention. The timing diagram 300 generally comprises a number of
address portions 302a-302n, a number of data portions 304a-304n and
a result portion 306. The address portions 302a-302n may
correspond to a number of address signals ADDRESS_1-ADDRESS_4, the
data portions 304a-304n may correspond to a number of data signals
DATA_1-DATA_4 and the result portion 306 may correspond to a result
signal RESULT. Additionally, a number of shift signals
SHIFT_1-SHIFT_4 may correspond to the data signals DATA_1-DATA_4,
respectively.

The timing diagram 300 may illustrate an 8-bit
implementation of the address portions 302a-302n and the data
portions 304a-304n. Each of the address portions 302a-302n may
comprise a first and a second operand (e.g., a multiplicand). The
first and second operands may be concatenated together in a single
address portion 302a-302n. For example, the address portion
302a (ADDRESS_1) may comprise an operand B and an operand D. The
operand B may occupy an upper 4 bits of the address portion 302a
(ADDRESS_1) and the operand D may occupy a lower 4 bits of the
address portion 302a (ADDRESS_1).

A read signal (e.g., READ) may attempt to read the
address signals ADDRESS_1-ADDRESS_4. Additionally, the read signal
READ may assume that registers exist at an address input to each of
the look-up-tables 102a-102n of the circuit 100 (e.g., as described
with respect to FIG. 2). The read signal READ may have a delay
(e.g., Tsu). The circuit 100 may look up (via the LUTs 102a-102n)
a respective multiplication value in response to the signals
ADDRESS_1-ADDRESS_4. The LUTs 102a-102n may store the respective
multiplication values in the data portions 304a-304n
(DATA_1-DATA_4). Each data portion 304a-304n (DATA_1-DATA_4) may
have a respective shift value (e.g., the signals SHIFT_1-SHIFT_4).
Additionally, the data portions 304a-304n (DATA_1-DATA_4) may
represent a product of the respective operands. For example, the
data portion 304a may represent the product of B * D.
Additionally, the data portions 304a-304n may have a delay (e.g.,
Tco).

A signal (e.g., ADD) may represent an addition of the
data signals DATA_1-DATA_4 and the shift signals SHIFT_1-SHIFT_4.
The signal ADD may assume that registers exist between the LUTs
102a-102n and the adder 104 (e.g., as described with respect to
FIG. 2). Since the adder may have registers, the data signals
DATA_1-DATA_4 may not be fully asynchronous. The signal ADD (e.g.,
the function add via the adder 104) may have a delay equivalent to
the delay Tsu. A signal (e.g., RESULT) may correspond to the final
result of the addition of the data values DATA_1-DATA_4. The
signal RESULT (e.g., the result block 106) may have a delay that
may be equivalent to the delay Tco and a delay (e.g., Tcc). The
delay Tcc may be implemented as a time delay through carry chains
or other appropriate arithmetic devices. The time delay Tcc may
vary in response to a particular implementation of the circuit 100.

The circuit 100 may allow a single multi-port memory to
be implemented to generate several partial products per clock
cycle. The circuit 100 may allow fewer LUTs to be implemented in
order to generate a given number of partial products. The circuit
100 may provide efficient utilization of resources for a
programmable device. The circuit 100 may provide faster
performance for look-up-tables. Additionally, the circuit 100 may
allow a designer to implement a design in a smaller and cheaper
device.

While the invention has been particularly shown and
described with reference to the preferred embodiments thereof, it
will be understood by those skilled in the art that various changes
in form and details may be made without departing from the spirit
and scope of the invention.

CLAIMS

1. An apparatus comprising:

one or more look-up-tables (LUTs) configured to provide 
logical functions, wherein said one or more LUTs are implemented 
within a multiport memory. 

2. The apparatus according to claim 1, wherein said 
multiport memory comprises a dual port memory. 

3. The apparatus according to claim 1, wherein said 
multiport memory comprises a quad port memory. 

4. The apparatus according to claim 1, wherein said
multiport memory is selected from a group consisting of a RAM, a 
ROM, a PROM, an EPROM, an EEPROM, a flash memory and other 
appropriate types of memories. 

5. The apparatus according to claim 1, wherein each of
said one or more LUTs is configured to receive one or more inputs.

6. The apparatus according to claim 5, wherein each of
said one or more inputs comprises a single-bit or multi-bit input
in a serial or parallel configuration.

7. The apparatus according to claim 5, wherein each of
said one or more LUTs is further configured to generate a partial
product signal.

8. The apparatus according to claim 7, wherein each of
said one or more LUTs is further configured to present said partial
product signal in response to said one or more inputs.

9. The apparatus according to claim 8, further 
comprising an adder circuit configured to receive said one or more 
partial product signals and present an output. 

10. The apparatus according to claim 9, wherein said
adder is further configured to present said output in response to
one or more second signals.

11. The apparatus according to claim 10, further
comprising a routable interconnect.

12. The apparatus according to claim 1, further
comprising:

one or more registers configured to increase a throughput
of said one or more look-up-tables.

13. The apparatus according to claim 1, wherein said
logical functions comprise arithmetic functions and other logic
functions.

14. An apparatus comprising: 

means for providing one or more look-up-tables (LUTs) in
a multiport memory; and

means for providing one or more logical functions, in 
response to said one or more LUTs, to at least one port of said 
multiport memory. 



15. A method for implementing logical functions,
comprising the steps of:

(A) providing one or more look-up-tables (LUTs) in a
multiport memory; and

(B) providing one or more logical functions, in response
to said one or more LUTs, to at least one port of said multiport
memory.

16. The method according to claim 15, wherein said
multiport memory comprises a dual port or quad port memory.

17. The method according to claim 15, further
comprising:

receiving one or more input signals, each comprising a
single-bit or multi-bit input in a serial or parallel
configuration.

18. The method according to claim 15, wherein said
memory is selected from a group consisting of a RAM, a ROM, a PROM,
an EPROM, an EEPROM, a flash memory and other appropriate types of
memory.

19. The method according to claim 15, further comprising
the steps of:

(C) presenting one or more partial product signals; and

(D) adding said one or more partial product signals.



20. The method according to claim 19, wherein step (D)
is further configured in response to one or more shift signals.

ABSTRACT OF THE DISCLOSURE

An apparatus comprising one or more look-up-tables
(LUTs). The LUTs may be configured to provide logical functions.
The one or more LUTs are generally implemented within a multiport
memory.